April  2011 


BISHOP  AND  HODYSS 


1241 


Adaptive  Ensemble  Covariance  Localization  in  Ensemble  4D-VAR  State  Estimation 

Craig  H.  Bishop  and  Daniel  Hodyss 

Naval  Research  Laboratory,  Monterey,  California 


(Manuscript  received  1  March  2010,  in  final  form  3  December  2010) 

ABSTRACT 

An  adaptive  ensemble  covariance  localization  technique,  previously  used  in  “local”  forms  of  the  ensemble 
Kalman  filter,  is  extended  to  a  global  ensemble  four-dimensional  variational  data  assimilation  (4D-VAR) 
scheme.  The  purely  adaptive  part  of  the  localization  matrix  considered  is  given  by  the  element-wise  square  of 
the  correlation  matrix  of  a  smoothed  ensemble  of  streamfunction  perturbations.  It  is  found  that  these  purely 
adaptive  localization  functions  have  spurious  far-field  correlations  as  large  as  0.1  with  a  128-member  en¬ 
semble.  To  attenuate  the  spurious  features  of  the  purely  adaptive  localization  functions,  the  authors  multiply 
the  adaptive  localization  functions  with  very  broadscale  nonadaptive  localization  functions.  Using  the  Navy’s 
operational  ensemble  forecasting  system,  it  is  shown  that  the  covariance  localization  functions  obtained  by 
this  approach  adapt  to  spatially  anisotropic  aspects  of  the  flow,  move  with  the  flow,  and  are  free  of  far-field 
spurious  correlations.  The  scheme  is  made  computationally  feasible  by  (i)  a  method  for  inexpensively  gen¬ 
erating  the  square  root  of  an  adaptively  localized  global  four-dimensional  error  covariance  model  in  terms  of 
products  or  modulations  of  smoothed  ensemble  perturbations  with  themselves  and  with  raw  ensemble  per¬ 
turbations,  and  (ii)  utilizing  algorithms  that  speed  ensemble  covariance  localization  when  localization  functions 
are  separable,  variable-type  independent,  and/or  large  scale.  In  spite  of  the  apparently  useful  characteristics  of 
adaptive  localization,  single  analysis/forecast  experiments  assimilating  583  200  observations  over  both  6-  and 
12-h  data  assimilation  windows  failed  to  identify  any  significant  difference  in  the  quality  of  the  analyses  and 
forecasts  obtained  using  nonadaptive  localization  from  that  obtained  using  adaptive  localization. 


1.  Introduction 

General  background  on  the  need  for  localization  in 
ensemble  based  data  assimilation  can  be  found  in 
Houtekamer  and  Mitchell  (2001)  and  Hamill  et  al.  (2001). 
[See  Evensen  (2003)  for  a  review  of  ensemble  Kalman 
filter  research.]  When  Nonadaptive  Ensemble  Covariance 
Localization  (NECL)  is  used  in  ensemble  data  assimila¬ 
tion  (DA),  raw  ensemble  covariances  are  attenuated  by  a 
function  that  only  depends  on  the  physical  distance  be¬ 
tween  the  covarying  error  variables.  To  try  and  improve 
on  this  type  of  approach,  Bishop  and  Hodyss  (2007, 
2009a, b)  have  introduced  a  variety  of  Adaptive  En¬ 
semble  Covariance  Localization  (AECL)  techniques. 
These  approaches  allow  the  localization  or  attenuation 
functions  to  widen  (narrow)  as  the  true  error  correlation 
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function  widens  (narrows).  They  also  provide  localization 
functions  that  move  with  the  flow  so  that  observations  of 
variables  at  the  end  of  a  (long)  data  assimilation  time 
window  can  be  used  to  correct  upstream  variables  at  the 
beginning  of  the  data  assimilation  window. 

In  the  idealized  DA  models  examined  in  Bishop  and 
Hodyss  (2007, 2009a, b),  AECL  only  showed  a  significant 
benefit  over  NECL  in  systems  in  which  either  (i)  the 
width  of  the  true  error  correlation  function  was  a  strongly 
varying  function  of  space  or  time  and/or  (ii)  the  true  error 
correlation  function  moved  a  distance  greater  than  or 
equal  to  the  width  of  the  optimally  tuned  nonadaptive 
localization  function.  As  such,  an  outstanding  question 
is  whether  the  error  correlation  functions  associated 
with  modern  numerical  weather  prediction  (NWP)  models 
exhibit  enough  variability  for  AECL  to  be  significantly 
superior  to  NECL.  This  study  begins  to  address  this 
question  by  comparing  AECL  and  NECL  performance 
in  experiments  using  the  Navy  Operational  Global  At¬ 
mospheric  Prediction  System  (NOGAPS;  Hogan  and 
Rosmond,  1991). 
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To  illustrate  AECL  methods  within  the  context  of  a 
global  numerical  weather  prediction  model,  Bishop  and 
Hodyss  (2007,  2009b)  incorporated  them  into  a  large  ob¬ 
servation  volume  local  ensemble  Kalman  filter  (Ott  et  al. 
2004;  Hunt  et  al.  2007).  These  local  observation  volumes 
enabled  sophisticated  AECL  schemes  to  be  implemented 
in  a  cost-efficient  way.  This  paper,  together  with  the  com¬ 
panion  paper  Bishop  et  al.  (2011),  extends  this  work  by 
showing  how  AECL  can  also  be  cost  efficiently  included  in 
global  ensemble  four-dimensional  variational  data  assim¬ 
ilation  (4D-VAR).  The  motivation  for  this  extension  to 
a  nonlocal  global  framework  is  as  follows. 

Local  observation  volumes  are  inappropriate  for  long¬ 
time  window  data  assimilation  because  errors  are  liable 
to  propagate  out  of  observation  volumes.  Local  ap¬ 
proaches  limit  the  effectiveness  of  some  variational  tech¬ 
niques  for  bias  correction  and  the  estimation  of  forecast 
and  observation  error  variances.  There  is  a  growing  in¬ 
terest  in  the  use  of  localized  ensemble  covariances  in  three- 
dimensional  variational  data  assimilation  (3D- VAR)  and 
4D-VAR  schemes  evident  in  papers  by  Lorenc  (2003), 
Buehner  (2005),  Buehner  et  al.  (2010a, b),  Wang  et  al. 
(2007),  Liu  et  al.  (2009),  and  others.  Buehner  et  al. 
(2010a,b)  found  that  an  ensemble  4D-VAR  scheme  that 
used  NECL  but  did  not  use  a  tangent  linear  model  or  ad¬ 
joint,  outperformed  a  version  of  the  operational  4D-VAR 
scheme  both  in  the  tropics  and  Southern  Hemisphere,  but 
not  in  the  northern  extratropics. 

The  promise  of  improving  4D-VAR  with  ensemble 
covariances  is  a  particularly  strong  motivation  for  this 
study  because  at  the  authors’  research  laboratory,  a 
major  effort  has  been  underway  for  the  last  9  yr  to 
create  the  world’s  first  operational  weak  constraint  4D 
global  variational  data  assimilation  system  called  the 
Naval  Research  Laboratory  (NRL)  Atmospheric  Vari¬ 
ational  Data  Assimilation  System-Accelerated  Repre¬ 
senter  (NAVDAS-AR;  Xu  et  al.  2005).  NAVDAS-AR 
became  the  operational  data  assimilation  scheme  for 
global  model  atmospheric  forecasting  in  September  of 
2009.  NAVDAS-AR  is  the  only  observation  space -based 
4D-VAR  system  currently  used  for  operational  weather 
forecasting.  El  Akkraoui  et  al.  (2008)  refer  to  this  ob¬ 
servation  space  form  as  the  dual  form  of  4D-VAR. 

While  the  NAVDAS-AR  code  was  under  construc¬ 
tion  many  aspects  of  it  were  in  a  rapid  state  of  flux.  In 
addition,  the  authors  were  unsure  about  how  best  to  in¬ 
corporate  AECL  and/or  NECL  within  NAVDAS-AR.  For 
these  reasons,  a  decision  was  made  to  create  a  prototype 
observation  space  ensemble  4D-VAR  using  both  NECL 
and  AECL  outside  of  the  developing  NAVDAS-AR  code 
so  that  experimentation  on  the  prototype  could  inform 
the  final  implementation  of  NECL  and  AECL  within 
NAVDAS-AR.  This  paper  together  with  a  companion 


paper  (Bishop  et  al.  2011)  reports  on  the  results  of  these 
efforts.  This  paper’s  aims  are  to  give  the  first  demonstra¬ 
tion  of  the  incorporation  of  flow-adaptive  ensemble  co- 
variance  localization  in  a  global  4D-VAR  algorithm  and 
to  describe  a  new  ensemble  covariance  localization  that 
blends  nonadaptive  localization  and  adaptive  localization. 
The  aim  of  the  companion  paper  is  to  show  how  as¬ 
sumptions  about  localization  functions  such  as  separabil¬ 
ity,  variable-type  independence,  and  large  length  scales 
can  be  used  to  greatly  reduce  the  cost  of  ensemble  co- 
variance  localization  within  4D-VAR. 

Section  2  gives  the  (new)  partially  adaptive  localiza¬ 
tion  approach  and  describes  how  to  incorporate  it  in 
a  4D-VAR  scheme.  Section  3  describes  experiments 
that  serve  to  test  and  tune  the  palette  of  localization 
functions  provided  by  the  partially  adaptive  localization 
approach,  while  section  4  provides  a  comparison  of  the 
similarities  and  differences  of  optimally  tuned  adaptive 
and  nonadaptive  localization  functions.  Conclusions 
follow  in  section  5. 


2.  Partially  Adaptive  Ensemble  Covariance 
Localization 

a.  Definition  of  forecast  error  covariance  matrix  using 
PAECL 

The  Partially  Adaptive  Ensemble  Covariance  Local¬ 
ization  (PAECL)  forecast  error  covariance  model  con¬ 
sidered  in  this  paper  uses  both  an  adaptive  localization 
matrix  and  a  nonadaptive  localization  matrix  CN 
such  that 

PfP  =  Pf  QCAQCN=  ZZt  ©  (ZvZj  ©  zszj)  ©  (WWT). 

(1) 

where  O  indicates  the  element-wise  matrix  product.  In 
(1),  P7  =  ZZT  =  is  the  raw  sample  covariance 

matrix  of  a  K  member  ensemble,  zk  is  the  kth  column  of 
the  square  root  of  this  matrix  (typically,  zk  =  xk/V K  —  1 
where  xk  is  the  Mi  ensemble  perturbation  about  the 
ensemble  mean).  The  underlines  in  the  above  terms  in¬ 
dicate  that  they  pertain  to  a  set  of  states  separated  in  time. 
For  example,  if  a  12-h  data  assimilation  window  was  be¬ 
ing  considered  with  a  1-h  discretization  in  time  then 
*1  =  [XI (0)> (!)> x£ (2),  ...,xj(12)]  where  gives 
the  /r-vector  describing  the  state  of  the  Mi  ensemble 
perturbation  at  the  ith  discrete  time  tL  defining  the  data 
assimilation  window.  The  matrix  CA  =  (Z^Zj  ©  Z^Zj)  is 
the  AECL  matrix;  where  ZSZJ  =  X^=iZ-z-T  is  the  sample 
correlation  matrix  of  a  smoothed  normalized  ensemble. 
Details  of  a  generalized  method  for  obtaining  Zs  are  given 
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in  the  appendix.  While  CA  allows  for  variable-dependent 
localization,  in  this  paper  we  suppress  the  capability 
because,  using  the  techniques  discussed  in  Bishop  et  al. 
(2011),  doing  so  reduces  the  cost  of  AECL  by  more 
than  a  factor  of  4.  Here,  the  attenuation  of  the  raw 
covariance  between  two  variables  depends  only  on  the 
position  of  the  two  variables  in  space-time  and  not  on 
their  variable  type.  This  type  of  variable-type  indepen¬ 
dent  localization  was  achieved  by  letting  the  smoothed 
zst  fields  be  based  solely  on  horizontally  and  vertically 
smoothed  streamfunction  perturbations.  We  also  ex¬ 
perimented  with  letting  the  smoothed  z-  fields  be  based 
on  smoothed  equivalent  potential  temperature  per¬ 
turbations.  It  was  found  that  AECL  DA  performance 
was  insensitive  to  this  choice.  Note  that  the  number  of 
smoothed  members  L  may  be  different  to  the  number 
K  of  raw  members. 

Since  Z^Zj  is  constructed  from  smoothed  raw  en¬ 
semble  perturbations,  its  correlation  functions  will  tend 
to  have  larger  (shorter)  length  scales  in  regions  where  the 
raw  ensemble  correlation  length  scales  are  large  (short). 
In  addition,  the  correlation  functions  of  ZsZj  will  tend  to 
propagate  in  a  similar  way  to  the  raw  ensemble  correla¬ 
tion  functions.  Finally,  note  that  the  correlation  functions 
associated  with  the  adaptive  localization  (Z^Zj)  ©  (Z^Zj) 
can  be  widened  or  narrowed  simply  by  respectively  in¬ 
creasing  or  decreasing  the  amount  of  smoothing  used  to 
create  Z„. 

The  matrix  =  WW  -  X,„  =  iWmwJ,  is  the  NECL 
matrix,  W  is  its  square  root  and  wm  denotes  the  rath 
column  of  W.  We  assume  that  each  column  wm  is  rep¬ 
resented  in  terms  of  a  separable  product  of  a  horizontal 
structure  function  and  a  vertical  structure  function.  The 
horizontal  structure  functions  are  given  by  spherical 
harmonics  while  the  vertical  structure  functions  are 
given  by  cosine  functions.  The  4D  state  wm  is  obtained 
by  replicating  the  rath  3D-structure  function  for  each  of 
the  variables  and  times  that  define  the  dimension  and 
ordering  of  the  4D  state  under  consideration.  This  rep¬ 
lication  procedure  means  that  no  localization  is  performed 
in  time  or  with  respect  to  intervariable  covariances.  In 
general,  the  broader  the  localization  functions  defined  by 
CN,  the  fewer  the  number  of  columns  M  in  W. 

Note  that  PAECL  in  (1)  reduces  to  pure  AECL  when 
every  element  of  is  equal  to  1;  in  other  words,  AECL 
comes  from  PAECL  when  W  =  Wj  =1  (M  =  1)  so  that 
=  11T,  where  1  is  a  vector  of  ones  as  long  as  the  state 


vector.  Alternatively,  (1)  delivers  pure  NECL  when 
Z,  =  z*  =  1  (L  =  1)  so  that  CA  =  11T. 

From  the  square  root  theorem  given  in  Bishop  and 
Hodyss  (2009b)  and  its  more  general  form  given  in 
Bishop  et  al.  (2011),  (1)  may  be  written  in  the  following 
form: 


K  L  L  M 

p£  =  X  X X  X(z*  o o  zsi  0  wj(zfc  ©  z)  © z*  © wJT. 

k= 1  ;=1  i—  1  m— 1 

(2) 


In  other  words,  the  forecast  error  covariance  model 
P p  =  ZpZp  to  be  used  in  this  paper  is  equal  to  the  co- 
variance  of  a  KL2M  member  ensemble  of  perturbations 
of  the  form  zk  ©  z s-  0  zj  ©  wm  all  stored  as  columns  of 
the  matrix  Zp.  Each  of  these  members  represents  an 
element-wise  product  of  a  scaled  raw  ensemble  member 
zk  with  the  product  of  two  smoothed  and  normalized 
ensemble  members  z s-  ©  z\  and  a  nonflow-dependent 
structure  function  wm.  These  function  products  amount 
to  modulations  of  the  raw  ensemble  and  it  is  for  this 
reason  that  we  refer  to  the  KL2M  ensemble  implied  by 
(2)  as  a  modulation  ensemble. 

Whether  the  primal  or  dual  form  of  4D-VAR  is  imple¬ 
mented,  variational  schemes  require  the  repeated  evalua¬ 
tion  of  matrix  vector  products  like  Zpa  and  Zpb.  It  may  be 
shown  that 


K  L  L 


Zfa=II  X(z*  ®^©zD©  (Wa..fc) 


k= 1 /=1 i— 1 

i/T  r 


and 


(-/Jb)yt  =  W  [{zk  ©  zs-  ©  z-)  ©  b], 


(3) 


where  a./7<  and  (Zpb)^  are  both  M  vectors  listing  all  of 
the  elements  of  a  and  Zpb,  respectively,  which  are  as¬ 
sociated  with  the  same  values  of  i,  j,  and  k  such  that  aT  = 
[am,  aJn, . . . ,  a Jjk, . . . ,  a^LK]  and  (Zpb)  =  [(Zpb)m, 
(Zpb)J11..,  (Zpb)p..,  (Zpb)JLi^].  The  term  Wa^  may  be 
associated  with  a  smoothly  varying  set  of  weights  for  the 
structure  given  by  (zk  ©  z)  0  z-).  The  companion  paper 
(Bishop  et  al.  2011)  shows  how  the  separable  and  “repli¬ 
cated”  nature  of  the  columns  of  W  can  be  used  to  rapidly 
evaluate  the  right-hand  sides  of  (3). 

Since  z)  ©  zj  =  z \  0  z),  (2)  reduces  to  the  more  com¬ 
putationally  efficient  form: 


K  L  M  K  L—l  L  M 

p£  =  X  X  X(z,f  ©  tj  ©  tj  ©  Wm)(z(t  ©  tj  ©  tj  ©  Wm)T  +  2  X  X  X  X(z*  ©^©  Z;  ©  Wm)(zfc  ©  tj  ©  zj  0  W  J 


k— 1  /=1  m= 1 

zDz 


k= 1  7=1  i=  7+1  m=\ 


-P—P- 


(4) 
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Thus,  if  all  K  members  of  the  raw  ensemble  were  lin¬ 
early  independent  and  all  L  members  of  the  smoothed 
ensemble  were  linearly  independent  and  all  M  columns  of 
W  were  independent,  then  (4)  indicates  that  the  modu¬ 
lation  ensemble  contains  up  to  MK[L  +  L(L  —  l)/2]  = 
MK[L(L  +  l)/2]  linearly  independent  members.  The 
matrix  Zp  lists  these  members.  We  never  have  to  hold  all 
of  these  members  in  memory  as  they  can  easily  be  pro¬ 
duced  as  needed  via  element-wise  products. 

On  the  computers  currently  available  at  the  larger 
numerical  weather  prediction  centers,  it  is  feasible  to 
generate  128  member  ensembles  of  6-18-h  forecasts  in 
a  timely  manner.  Similarly,  it  is  feasible  to  smooth  each 
of  these  forecasts  in  a  timely  manner  to  obtain  L  =  K  = 
128  smooth  members.  Hence,  ensemble  sizes  for  which 
[K2(K  +  1)12 ]M  =  [1282  X  129/2 \M  =  1  056  768 M  are 
readily  obtained.  Global  numerical  weather  prediction 
models  will  soon  be  running  at  10-km  horizontal  reso¬ 
lution  with  100  vertical  levels  and  about  10  variables  at 
each  grid  point.  Such  models  will  have  approximately 
1010  variables.  Thus,  with  a  K  =  128  member  ensemble 
at  this  resolution  and  a  W  with  M  =  10  000  independent 
structure  functions,  Zp  could  have  as  many  linearly  in¬ 
dependent  columns  as  there  were  model  variables — 
even  at  this  very  high  resolution.  In  this  way,  (4)  re¬ 
moves  Lorenc’s  (2003)  concern  that  ensemble-based 
forecast  error  covariance  models  might  be  rank  deficient 
and  unable  to  fit  large  numbers  of  very  accurate  obser¬ 
vations. 

b.  Global  variational  solution  for  analysis  correction 

While  we  experimented  with  both  the  primal  and  dual 
forms  of  4D-VAR  and  found  that  they  both  gave  similar 
results,  the  results  shown  here  pertain  to  the  dual  or 
observation  space  form  of  4D-VAR  of  which  NAVD  AS- 
AR  is  an  example.  The  dual  form  can  be  derived  by 
noting  that  the  minimum  error  variance  estimate  xa  of  the 
true  4D  state  x  given  the  forecast  xf  and  observations  y  is 

\a=\f  +  P£HT(HP£HT  +  R)~x[y  -  H(\f)] 

=  xf  +  p£htr-1/2(r-1/2hp£htr-1/2  +  I)  'FTl/2[y 

-  H(xf )]  =xf  +  p£ht(HP£ht  +  iyl\y~H(V)] 

(5) 

assuming  that  the  forecast  and  observation  error  co- 
variance  matrices  are  accurate  (see  e.g.,  Xu  et  al.  2005; 
Daley  and  Barker  2001).  The  tildes  in  the  last  line  of  (5) 
indicate  that  the  matrices/vectors  have  been  premulti¬ 
plied  by  R-1//2.  To  avoid  the  high  computational  cost  of 
the  matrix  inverse  indicated  in  (5),  Daley  and  Barker 
(2001)  note  that  since  the  gradient  of  the  quadratic  form 


J  =  ^bT(HP£HT  +  l)b  -  bT[y  -  //(V)]  (6) 

with  respect  to  b  is 

^  =  (HP^HT  +  l)b-[T2^2)] 

=  (HZpZjHT  +  \)b-[i^H(V)l  (7) 

the  vector  bmin  that  minimizes  (6)  makes  dJ/db  =  0  so 
that 


bmin  =  (HP^HT  +  l)-1[y  (8) 

and  (5)  is  equivalent  to 

xW  +  P£HTbmin-  (9) 

The  vector  bmin  was  obtained  by  minimizing  (6)  using 
the  conjugate  gradient  technique.  The  conjugate  gradi¬ 
ent  technique  exhibited  steady  convergence  for  all  ex¬ 
periments  reported  here.  To  ensure  tight  convergence  in 
all  experiments,  128  iterations  of  the  conjugate  gradient 
inner  loop  were  performed  to  find  bmin. 

3.  Experimental  setup  for  tuning  of  localization 
functions 

a.  Generation  of  truth  run  and  pseudo -observations 

We  consider  an  idealization  in  which  we  let  the  true 
state  of  the  atmosphere  x*  be  defined  by  a  30-42-h 
forecast  using  NOGAPS  run  at  horizontal  spectral  res¬ 
olution  Til 9  with  30  vertical  levels  (T119L30).  The 
NAVD  AS  analysis  at  0000  UTC  23  June  2005  was  used 
to  initialize  this  truth  run.  NAVD  AS  (Daley  and  Barker 
2001)  is  the  3D-VAR  form  of  NAVDAS-AR  and  it  was 
used  for  operational  Navy  weather  forecasting  until 
September  2009  when  it  was  replaced  by  its  4D-VAR 
counterpart  NAVDAS-AR  (Xu  et  al.  2005).  The  first 
guess  x^  was  taken  to  be  a  6-18-h  forecast  valid  at  the  same 
time  as  the  30^12-h  forecast  used  to  generate  the  truth. 
The  initial  condition  for  this  first  guess  was  the  NAVD  AS 
analysis  0000  UTC  24  June  2005.  Since  we  wish  to  compare 
the  effect  of  adaptive  localization  in  both  6-  and  12-h  DA 
windows,  this  same  6-18-h  forecast  will  be  used  as  the 
first  guess  for  a  6-h  DA  window  running  from  0900  to 
1500  UTC  and  for  a  12-h  DA  window  running  from  0600 
to  1800  UTC. 

Figure  1  gives  the  vertical  variation  of  global  hori¬ 
zontal  averages  of  the  root-mean-square  (rms)  error  of 
this  first  guess  in  wind  and  temperature  at  1200  UTC. 
The  rms  wind  errors  exceed  4  m  s”1  at  the  jet  level  and 
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Fig.  1.  The  vertical  profile  of  the  root-mean-square  error  of  the 
prior  first  guess  at  1200  UTC  in  zonal  wind  (gray  line),  meridional 
wind  (black  line),  and  temperature  (dashed  line).  The  vertical  axis 
indicates  the  models  vertical  cr-level  index  (level  30  is  the  earth’s 
surface)  while  the  horizontal  axis  indicates  speed  (ms-1)  for  the 
wind  components  and  temperature  (K)  for  the  temperature. 


in  the  stratosphere  while  rms  temperature  errors  exceed 
1  K  throughout  much  of  the  depth  of  the  atmosphere. 
Note  that  this  “error”  is  entirely  due  to  the  NAVDAS 
assimilation  of  observations  over  the  24-h  period  pre¬ 
ceding  the  initialization  of  the  first-guess  run.  Hence,  the 
structure  of  the  difference  between  the  two  states  is 
partially  determined  by  the  quasi-isotropic  error  corre¬ 
lation  functions  of  NAVDAS  and  partially  determined 
by  the  atmospheric  perturbation  dynamics  of  both  the 
atmosphere  and  NOGAPS. 

Pseudo-observations  can  be  created  at  any  desired 
location  by  adding  random  numbers  representative  of 
“observation  error”  to  variables  from  the  true  state  se¬ 
lected  for  “observation.”  Two  types  of  observational 
network  are  considered.  The  first  pertains  to  a  6-h  DA 
window  and  features  194  400  observations  of  zonal  and 
meridional  wind  ( u ,  v)  and  temperature  T  at  0900, 1200, 
and  1500  UTC  for  a  total  of  583  200  observations.  We 
shall  hereafter  refer  to  this  observational  network  as  the 
6-h  network.  The  second  network  is  almost  identical 
except  it  pertains  to  a  12-h  DA  window  and  features 
194  400  observations  of  zonal  and  meridional  wind  ( u ,  v) 
and  temperature  T  at  0600, 1200,  and  1800  UTC  and  will 
be  referred  to  as  the  12-h  network. 

The  standard  deviation  of  observation  error  was  2  ms"1 
for  the  wind  components  and  1  K  for  temperature.  For 
simplicities  sake  and  to  check  whether  adaptive  localiza¬ 
tion  provides  any  clear  systematic  benefit  when  the  ob¬ 
serving  network  is  quasi-isotropic  and  quasi-homogeneous, 
we  let  the  observations  subsample  the  T119L30  grid  at 


every  third  point  in  both  the  zonal  and  meridional  di¬ 
rections.  In  addition,  no  observations  were  included  be¬ 
tween  the  South  Pole  and  80°S  and  the  North  Pole  and 
80°N.  In  the  vertical,  observations  were  at  every  third 
sigma  level  (for  a  total  of  10  levels)  beginning  at  the  sigma 
level  immediately  above  the  terrain  pressure. 

b.  Generation  of  ensemble 

A  128-member  ensemble  was  generated  using  the 
technique  described  in  McLay  et  al.  (2010)  that  is  a 
modification  of  the  ensemble  transform  (ET)  technique 
described  in  McLay  et  al.  (2008).  The  modification  is 
that  in  McLay  et  al.  (2010)  transformation  coefficients 
are  computed  for  five  distinct  latitudinal  bands  and  then 
interpolated  between  the  centers  of  the  bands  whereas 
in  McLay  et  al.  (2008)  only  one  set  of  global  trans¬ 
formation  coefficients  are  computed.  As  pointed  out  in 
McLay  et  al.  (2008),  the  ET  technique  (Bishop  and  Toth 
1999)  is  a  variation  on  Toth  and  Kalnay’s  (1997)  breeding 
technique  and  as  such  must  be  cycled  for  a  few  days  in 
order  to  “breed”  growing  structures.  To  this  end,  the  128- 
member  ET  ensemble  was  cycled  for  6  days  preceding  the 
first  DA  window. 

Because  of  the  way  they  are  generated,  the  ET  en¬ 
semble  perturbation  structure  is  entirely  independent 
of  the  error  covariance  functions  used  to  define  the 
NAVDAS  forecast  error  covariance  model.  Consequently, 
the  ET  perturbations  are  not  particularly  well  configured 
to  correct  the  difference  between  the  experiments  first 
guess  and  pseudotruth  because  this  difference  is  partially 
determined  by  the  NAVDAS  forecast  error  covariance 
functions.  Mismatches  between  the  distribution  from 
which  forecast  errors  are  drawn  and  the  distribution 
from  which  ensemble  perturbations  are  drawn  must  also 
occur  in  operational  ensemble  DA  systems  when  known 
and  unknown  sources  of  model  error  are  not  accurately 
accounted  for.  Hence,  the  mismatch  between  the  dis¬ 
tribution  from  which  our  forecast  error  is  drawn  and  that 
from  which  our  ensemble  perturbations  is  drawn  has 
some  very  crude  similarities  to  what  occurs  in  opera¬ 
tional  ensemble  DA  schemes.  When  such  distribution 
mismatches  are  present,  ensemble  covariance  localiza¬ 
tion  has  two  roles:  to  suppress  spurious  ensemble  cor¬ 
relations  and  to  create  variance  in  directions  of  error 
that  are  not  well  sampled  by  the  ensemble  generation 
process. 

c.  Tuning  of  localization  functions 

We  compare  three  types  of  localization:  NECL,  where 
Pp  =  ZZT  ©  WWT;  AECL,  where  Pfp  =  ZZT  ©  ZvZj © 
ZvZj :  and  PAECL,  where  Pp  =  P-^0  CA  ©  Cv  = 
ZZT  ©  (ZsZj  ©  ZsZj)  ©  (WWT).  The  width  of  the  cor- 
relation  functions  associated  with  the  NECL  matrix 
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cN  =  WWT  are  adjusted  via  two  length  parameters  that 
control  the  width  of  the  horizontal  and  vertical  functions 
that  define  the  columns  of  C^,  while  the  width  of  the 
correlation  functions  associated  with  the  AECL  matrix 
Q.a  —  ZsZj  ©  ZsZj  is  controlled  by  varying  two  smooth¬ 
ing  parameters  that  control  the  strength  of  the  horizontal 
and  vertical  smoothing  applied  to  turn  the  raw  ensemble 
perturbations  ZK  into  the  smoothed  ensemble  perturba¬ 
tions  Zs.  In  principle,  these  parameters  can  be  tuned  to 
optimize  any  given  cost  function.  We  chose  to  tune  them 
to  optimize  the  cost  function 

f  K°rst(Q]2  ,  RToF 

lKT(i2)]2  K"or(i2)]2 

J5D  do 

[rp™r(i2)]2J 


■^observed 


at  the  analysis  time  t  =  12,  where  (t)]2,  [i»cnSt(t)]2 
and  [TP°st(f)]2  give  the  global  average  of  the  square  of 
the  posterior  error  of  the  state  estimate  in  the  tropo¬ 
sphere  of  zonal  wind  u ,  meridional  wind  u,  and  tem¬ 
perature  T  at  1200  UTC  after  the  assimilation  of 
observations  from  the  6-h  network.  At  the  optimization 
time  t  =  1200  UTC,  [«!T(r)]2,  Kf(r)]2,  and  [7^f(r)]2 
pertain  to  the  error  of  the  analysis  obtained  from  (9). 
At  later  times,  t  >  1200  UTC,  [w££st(r)]2,  [tC‘(f)]2,  and 
[TP°st(f)]2  pertain  to  the  error  of  the  nonlinear  forecast 
initialized  with  the  1200  UTC  analysis.  The  symbols, 
[«cnSt(r)]2,  KnSt(r)]2,  and  [TP°st(?)]2  give  the  global  av- 
erage  in  the  troposphere  of  the  square  of  the  prior  error 
or  first-guess  error  of  the  12-h  forecast  initialized  at 
0000  UTC  24  June  2005  of  zonal  wind  u ,  meridional 
wind  v ,  and  temperature  T  before  the  assimilation  of 
observations.  Hence,  /Observed(0  gives  the  average  frac¬ 
tional  reduction  in  the  error  variance  of  the  observed 
variables  due  to  the  assimilation  of  observations.  Al¬ 
though  not  used  for  tuning  the  localization  functions,  we 
also  measured  the  fractional  reduction  in  error  variance 
of  the  unobserved  variables  using 


J 


unobserved 


[(pXTU)}2 

[(pxri°r(i2)]2 


[gffW  1 

[<?crr°r(12)]2J  ' 

(11) 


where  the  meaning  of  the  overbars,  subscripts,  and  su¬ 
perscripts  are  the  same  as  in  (10),  but  ps  and  q  denote  the 
unobserved  variables  of  surface  pressure  and  specific 
humidity,  respectively. 

To  ensure  that  global  averages  gave  equal  weight 
to  areas  with  the  same  surface  area,  gridpoint  errors 
were  multiplied  by  the  cosine  of  their  latitude  before 


inclusion  in  the  global  sum  in  order  to  account  for  the 
fact  that  the  distance  between  NO  GAPS  Gaussian  grid 
points  decreases  with  the  cosine  of  latitude.  To  make 
sure  that  the  errors  pertained  primarily  to  the  tropo¬ 
sphere,  only  the  lower  23  model  levels  corresponding 
to  the  atmosphere  around  100  hPa  and  below  were  in¬ 
cluded. 

As  previously  mentioned,  each  of  the  localization 
methods  was  tuned  to  minimize  the  state  estimation 
error  at  the  analysis  time  for  the  6-h  data  assimilation 
window.  To  see  the  effect  of  lengthening  the  observation 
window,  the  localization  tunings  that  minimized  analysis 
error  variance  of  the  observed  variables  for  the  6-h 
network  were  then  used  to  assimilate  observations  from 
the  12-h  network. 

Figure  2a  depicts  the  evolution  of  the  state  estima¬ 
tion  error  as  measured  by  /Observed(0  [(9)]  from  the 
analysis  time  out  to  a  5 -day  forecast  for  all  of  the  lo¬ 
calization  times  and  for  both  the  6-  and  12-h  observa¬ 
tional  networks.  Figure  2b  depicts  the  evolution  of  the 
corresponding  measure  of  the  error  in  the  unobserved 
variables  /un observed(0*  Figures  2a, b  show  that  after 
tuning  the  differing  localization  methods  NECF,  AECE, 
and  PAECE  all  give  very  similar  5 -day  forecast  errors 
for  both  observed  and  nonobserved  variables  and  for 
both  the  6-  and  12-h  observational  networks.  All  of  the 
approaches  profoundly  reduce  state  estimation  error. 
Figure  2a  (Fig.  2b)  shows  that  it  takes  about  96  h  (84  h) 
for  the  forecast  error  variance  in  the  observed  (un¬ 
observed)  variables  to  exceed  the  corresponding  forecast 
error  variance  of  the  first  guess.  For  both  the  6-  and  12-h 
networks,  the  analysis  error  obtained  from  AECL  is 
somewhat  larger  than  that  from  NECL  and  PAECL. 
However,  for  the  12-h  network  in  particular,  the  5-day 
forecast  error  from  AECL  is  smaller  than  NECL  and 
PAECL.  However,  none  of  these  differences  are  large 
enough  to  be  statistically  significant. 

An  interesting  feature  of  Fig.  2b  is  that  the  error  of  the 
forecast  of  the  unobserved  terrain  pressure  and  humid¬ 
ity  variables  decreases  for  the  first  12  h  of  the  forecast. 
We  speculate  that  this  improvement  is  partially  a  result 
of  the  dynamics  of  the  nonlinear  model  bringing  the 
unobserved  variables  into  an  improved  dynamical  bal¬ 
ance  with  the  observed  variables.  As  noted  by  Kepert 
(2009),  while  one  could  expect  analyses  made  with  un¬ 
localized  ensemble  covariances  to  satisfy  linear  balance 
constraints  at  large  scales,  when  localization  is  em¬ 
ployed,  imbalance  is  to  be  expected  at  all  scales. 

Using  128  Power  5  +  IBM  processors  at  1.9  GHz,  it  was 
found  that  it  took  53  and  73  min  to  obtain  the  1200  UTC 
analysis  using  NECL  and  PAECL,  respectively.  This 
time  was  the  same  for  both  the  6-  and  12-h  DA  windows 
because  the  total  number  of  observations  was  the  same 
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Fig.  2.  Normalized  root-mean-square  posterior  error  as  a  function  of  forecast  time.  Blue,  red,  and  green  curves 
pertain  to  nonadaptive,  partially  adaptive,  and  purely  adaptive  localization.  Dashed  and  solid  curves  pertain  to  the  6- 
and  12-h  data  assimilation  window,  respectively,  (a)  The  results  for  the  mean  of  the  normalized  observed  variables  u, 
v,  and  T  [(10)].  (b)  The  corresponding  results  for  the  normalized  unobserved  variables  ps  and  q  [(11)]. 


for  both  of  these  experiments.  The  relatively  small  dif¬ 
ference  in  computational  cost  is  due  to  the  fact  that  the 
tight  nonadaptive  localization  (Fig.  6)  required  by 
NECL  corresponds  to  a  nonadaptive  localization  matrix 
W  that  has  many  more  columns  than  the  number  of 
columns  in  the  W  corresponding  to  the  very  broad  local¬ 
ization  (Fig.  5)  used  for  PAECL.  As  noted  in  Bishop  et  al. 
(2011),  the  cost  of  localization  can  be  reduced  if  one  de¬ 
fines  the  localization  matrix  on  a  coarser-resolution  grid 
than  the  raw  ensemble  perturbations.  This  method  of 
reducing  computational  cost  was  not  employed  in  our 
calculations.  If  we  had  employed  a  reduced-resolution 
grid  that  had  an  order  of  magnitude  fewer  points  than  our 
T119L30  grid,  then  one  would  expect  an  order  of  mag¬ 
nitude  reduction  in  the  amount  of  time  required  for  each 
analysis.  Were  it  not  for  the  fact  that  the  optimal  tuning 
for  NECL  gave  the  NECL  covariance  matrix  many  more 
columns  than  that  of  the  PAECL  matrix,  the  margin  by 
which  NECL  was  faster  than  PAECL  would  have  been 
even  greater. 

Since  our  adaptive  localization  functions  are  built 
from  ensemble  perturbations,  the  quality  of  the  adaptive 
localization  is  closely  linked  to  the  quality  of  the  en¬ 
semble  perturbations.  If  assimilating  real  data,  this 
quality  depends  on  how  well  the  ensemble  generation 
scheme  accounts  for  all  sources  of  error.  Accounting  for 
unknown  sources  of  model  error  is  very  difficult.  In  our 
experiment,  we  modeled  the  error-ensemble  mismatch 
using  distinctly  different  methods  to  create  a  pseudo- 
first-guess  error  and  the  ensemble  perturbations.  The 
mismatch  created  by  our  approach  may  be  greater  than 
that  between  actual  forecast  error  and  ensemble  per¬ 
turbations  from  state-of-the-art  ensemble  generation 
schemes.  Consequently,  the  potential  benefit  of  adaptive 


localization  could  be  greater  than  that  suggested  by  our 
results. 

The  results  from  Bishop  and  Hodyss  (2007,  2009a) 
indicate  that  adaptive  localization  is  only  likely  to  de¬ 
liver  large  benefits  when  true  error  correlation  functions 
propagate  a  significant  distance  relative  to  the  width 
of  the  localization  functions  and/or  when  the  true  error 
correlation  length  scales  varies  markedly  from  one  lo¬ 
cation  to  another.  If  the  primary  reason  for  the  neutrality 
of  our  results  is  short  error  propagation  distances  over  6- 
12-h  DA  windows  and  weak  spatial  inhomogeneity  of  the 
true  error  correlation  length  scale,  then  the  neutrality  of 
our  results  is  unlikely  to  change  with  an  improved  en¬ 
semble  model  of  the  forecast  error  distribution. 

4.  The  PAECL  palette  of  localization  functions 

As  previously  mentioned,  adaptive  localization  is 
likely  to  be  beneficial  if  covariance  functions  move 
a  significant  distance  relative  to  the  localization  length 
scale  and/or  if  the  true  error  correlation  length  scale  is 
highly  inhomogeneous.  In  this  section,  we  compare  and 
contrast  the  effect  of  adaptive  and  nonadaptive  locali¬ 
zation  functions  at  points  on  the  globe  where  such 
characteristics  are  likely  to  be  found. 

To  find  a  point  where  the  true  error  correlation 
function  is  likely  to  exhibit  significant  propagation,  we 
searched  for  the  maximum  value  of  zonal  wind  at 
(7-level  15  (around  400  hPa)  at  1200  UTC.  The  location 
of  this  maximum  value  in  zonal  wind  was  found  to  be  at 
40°S,  90°E.  This  (winter)  region  is  typically  characterized 
by  strong  baroclinic  zonal  jets  so  it  was  no  surprise  to  find 
the  strongest  zonal  wind  at  this  point.  Figure  3  gives 
horizontal  and  vertical  perspectives  of  the  unlocalized 
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Fig.  3.  Unlocalized  ensemble  covariance  function  of  meridional  wind  at  1800  and  1200  UTC  with  1800  UTC 
meridional  wind  variables  at  40°S,  90°E  and  cr-level  15  (about  400  hPa).  The  ensemble  has  128  members.  The  hor¬ 
izontal  cross  sections  at  cr-level  15  of  the  covariance  function  at  (a)  1800  and  (b)  1200  UTC.  The  zonally  oriented 
vertical  cross  sections  at  40°S  of  the  covariance  function  at  (c)  1800  and  (d)  1200  UTC. 


ensemble  covariance  function  of  meridional  wind  at 
1800  (Figs.  3a, c)  and  1200  UTC  (Figs.  3b, d)  with  a  single 
1800  UTC  meridional  wind  variable  located  at  40°S, 
90°E.  The  function  corresponds  to  a  column  of  the  un¬ 
localized  ensemble  covariance  matrix  =  ZZT.  This 
particular  column  was  chosen  because  its  diagonal  ele¬ 
ment  corresponds  to  a  local  maximum  in  the  westerly 
zonal  wind.  The  structure  of  this  function  is  identical  to 
that  of  the  single  observation  correction  that  would  be 
obtained  from  assimilating  an  observation  at  1800  UTC 
of  meridional  wind  at  the  model  grid  point  at  40°S,  90°E 
and  cr-level  15  (about  400  hPa).  The  horizontal  cross 
section  of  the  1800  UTC  covariance  function  shown  in 
Fig.  3a  features  a  positive  region  centered  around  40°S, 
90°E  and  cr-level  15  with  negative  regions  to  the  east  and 
west  of  this  central  region.  Because  this  pattern  is  not 
unlike  that  of  a  baroclinic  wave  packet  (Zimin  et  al. 
2003),  it  is  conceivable  that  the  wavelike  pattern  of 


covariance  stretching  from  65°  to  105°E  is  not  spurious. 
However,  it  would  be  more  difficult  to  argue  that  the 
undulations  between  50°  and  60°S  together  with  those 
equatorward  of  15°S  represent  nonspurious  covariances. 
Figure  3b  shows  that  the  1200  UTC  covariance  function 
is  broadly  similar  to  the  1800  UTC  covariance  function 
except  that  it  lies  8°-10°W  (upstream)  of  the  1800  UTC 
covariance  function.  This  is  unsurprising  given  the 
strong  westerly  winds  associated  with  this  location. 

Figures  3c, d  give  the  corresponding  vertical  structure 
at  1800  and  1200  UTC,  respectively.  Interestingly,  from 
the  surface  to  sigma  level  15,  the  functions  tilt  westward 
with  increasing  height  in  a  manner  that  is  qualitatively 
similar  to  a  growing  baroclinic  wave.  Also  note  that 
upper-level  features  lie  a  greater  distance  to  the  west  of 
lower-level  features  at  1200  than  at  1800  UTC.  This 
change  of  vertical  tilt  with  time  is  reminiscent  of  non- 
modal  baroclinic  wave  growth  (Farrell  1989). 
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Fig.  4.  The  AECL  function  for  the  raw  covariance  function  shown  in  Fig.  3  is  shown.  This  localization  function  is 
the  element-wise  square  of  the  correlation  function  of  a  128-member  ensemble  of  smoothed  and  normalized 
streamfunction  fields. 


Figure  4  gives  the  AECL  localization  function  corre¬ 
sponding  to  Fig.  3.  Figure  4a  shows  that  at  the  1800  UTC 
time,  the  0.1  contour  of  the  adaptive  localization  func¬ 
tion  stretches  from  30°  to  50°S  and  from  about  80°  to 
108°E.  Figure  4b  shows  that  the  function  to  moderate 
the  raw  ensemble  covariances  at  1200  UTC  is  similar  to 
that  for  1800  UTC  except  the  pattern  is  shifted  about 
6°W  of  the  1800  UTC  localization  function.  This  move¬ 
ment  is  consistent  though  a  little  slower  than  the  apparent 
movement  of  the  raw  covariance  function  shown  in  Fig.  3. 
Figures  4c, d  show  that  the  adaptive  localization  function 
tilts  westward  with  height  and,  hence,  is  well  suited  to 
preserving  the  westward  tilt  of  the  raw  covariance  func¬ 
tion  shown  in  Fig.  3. 

What  Fig.  4  does  not  show  is  that  the  AECL  locali¬ 
zation  function  has  oscillations  in  the  Northern  Hemi¬ 
sphere  with  isolated  maxima  as  large  as  0.1.  The 
nonadaptive  part  =  WWT  of  the  PAECL  scheme 
was  introduced  to  attenuate  such  far-held  oscillations  of 


the  AECL.  For  this  purpose,  the  localization  functions 
associated  with  =  WWT  were  designed  to  have  no 
vertical  variation  but  a  very  broad  horizontal  variation. 
The  structure  of  the  column  of  =  WWT  corre¬ 
sponding  to  Figs.  3  and  4  that  optimized  the  DA  per¬ 
formance  of  PAECL  is  shown  in  Fig.  5.  Note  that  this 
function  is  broad  enough  to  preserve  the  significant  parts 
of  the  AECL  localization  function  shown  in  Fig.  4  and 
most  of  the  raw  covariance  function  (Fig.  3)  that  is  likely 
to  be  nonspurious  at  both  1800  and  1200  UTC. 

The  full  PAECL  localization  function  (column  of 
Qa  O  c„)  corresponding  to  Fig.  3  is  not  shown  here 
because  it  is  almost  identical  to  that  shown  in  Fig.  4.  This 
is  because  the  nonadaptive  part  (Fig.  5)  is  so  much 
broader  than  the  adaptive  part  (Fig.  4).  As  such,  we  use 
Fig.  4  as  a  proxy  for  the  full  PAECL  localization  func¬ 
tion. 

It  is  of  interest  to  compare  this  proxy  for  the  PAECL 
localization  function  with  the  purely  NECL  localization 
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Fig.  5.  This  is  the  very  broadscale  horizontal  nonadaptive  lo¬ 
calization  function  that  is  used  to  attenuate  spurious  far-held  fea¬ 
tures  of  the  adaptive  localization  function  shown  in  Fig.  4.  Note 
that  this  nonadaptive  localization  function  has  no  vertical  varia¬ 
tion. 


function  shown  in  Fig.  6.  This  nonadaptive  localization 
function  has  been  tuned  to  minimize  the  area-weighted 
average  of  analysis  error.  It  has  a  slightly  reduced  hori¬ 
zontal  scale  relative  to  PAECL  having  a  0.1  contour 
stretching  32°-48°S  and  80°-100°E.  Comparison  of  Figs. 
6b  and  4c,  however,  shows  that  it  has  a  significantly 
smaller  vertical  scale  than  the  PAECL  function.  Note 
that  the  nonadaptive  localization  function  is  the  same  at 
both  1200  and  1800  UTC. 

Figures  7  and  8  give  the  raw  ensemble  covariance 
functions  localized  with  PAECL  and  NECL,  respec¬ 
tively.  Figure  7  shows  that  PAECL  preserves  much  of 
the  positive  central  lobe  and  the  eastern  negative  lobe  of 
the  raw  ensemble  covariance  function  at  both  1800  and 
1200  UTC.  At  both  1800  and  1200  UTC,  PAECL  allows 
the  central  positive  lobe  to  have  a  larger  magnitude  than 
the  eastern  negative  lobe — as  it  is  in  the  raw  covari¬ 
ance  function.  In  contrast,  because  the  1200  UTC  non¬ 
adaptive  localization  function  does  not  lie  to  the  east 
of  the  1800  UTC  nonadaptive  localization  function, 
NECL  forces  the  negative  eastern  lobe  to  have  a  larger 
magnitude  than  the  positive  lobe  at  1200  UTC.  In  ad¬ 
dition,  PAECL  preserves  much  more  of  the  vertical 
structure  of  the  raw  ensemble  covariance  than  NECL. 
However,  we  must  recall  that  the  covariance  functions 
displayed  in  Figs.  7  and  8  pertain  to  a  point  where  the 
ensemble  mean  zonal  wind  is  maximized;  hence,  most 
points  on  the  globe  would  be  in  less  need  of  adaptive 
localization  than  the  point  shown  in  Figs.  7  and  8.  This 


Longitude 


Fig.  6.  As  in  Figs.  4a, c,  but  the  NECL  function  is  shown.  (The 
function  is  the  same  at  1800  and  1200  UTC.) 


may  be  why  we  did  not  find  a  significant  benefit  from 
adaptive  localization. 

While  Figs.  3  to  8  give  insight  into  how  PAECL  can 
accommodate  moving  error  fields,  Fig.  9  compares  and 
contrasts  the  wide  variety  of  3D  localization  function 
structures  provided  by  PAECL  with  those  given  by 
NECL  localization  for  4  points  in  the  Northern  Hemi¬ 
sphere.  Figure  9  shows  that,  on  average,  the  horizontal 
extent  of  the  PAECL  localization  functions  is  similar  to 
that  of  the  NECL  localization  functions  while  the  ver¬ 
tical  extent  of  PAECL  localization  functions  is  greater 
than  that  of  the  NECL  localization  function. 

The  localization  functions  that  minimize  area-averaged 
root-mean-square  analysis  error  in  our  experiments  are 
considerably  narrower  than  those  used  by  Buehner  et  al. 
(2010a, b).  In  their  EnKF  and  (for  the  sake  of  consistency) 
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Fig.  7.  As  in  Fig.  3,  but  the  raw  ensemble  covariance  function  has  been  localized  with  the  PAECL. 


the  various  flavors  of  ensemble  4D-VAR  considered  in 
their  study,  the  horizontal  localization  function  goes  to 
0.2  of  its  maximum  value  1400  km  away  from  the  func¬ 
tion  maximum  and  to  zero  2800  km  away  from  the 
function  maximum.  In  contrast,  Figs.  4  and  6  indicate 
that  both  our  adaptive  and  nonadaptive  localization 
functions  are  below  0.1  of  their  maximum  values  about 
1000  km  away  from  their  respective  function  maxi- 
mums.  The  2800-km  localization  limit  used  in  Buehner 
et  al.  (2010a, b)  was  first  arrived  at  as  part  of  the  study  by 
Mitchell  et  al.  (2002)  on  ensemble  size  and  balance  is¬ 
sues  in  the  Environment  Canada  (EC)  stochastic  EnKF. 
Researchers  at  EC  have  steadily  improved  their  EnKF 
since  that  time  [see  Houtekamer  et  al.  (2009)  for  a  more 
recent  description  of  the  system].  However,  no  intense 
effort  has  been  made  since  Mitchell  et  al.  (2002)  to  retune 
their  EnKF  localization  limit  as  it  was  felt  that  the  few 
experiments  that  have  been  performed  with  differing 
localization  limits  did  not  justify  an  extensive  tuning  ef¬ 
fort  for  this  parameter  (H.  L.  Mitchell  2010,  personal 
communication). 


In  reflecting  on  why  our  optimal  localization  limit  is  so 
different  to  that  obtained  by  Mitchell  et  al.  (2002),  it  is 
worth  recalling  that  their  EnKF  assimilates  observations 
in  sequential  batches.  The  same  localization  function 
is  applied  to  each  batch  regardless  of  how  effectively 
preceding  batches  of  observations  have  removed  error 
from  large  scales.  This  invariance  of  the  localization 
function  to  previous  observations  assimilated  means 
that  the  correction  obtained  from  sequential  assimila¬ 
tion  will  be  somewhat  different  to  that  which  would  be 
obtained  by  assimilating  all  of  the  observations  in  one 
large  batch — as  is  done  in  our  ensemble  4D-VAR.  With 
this  in  mind,  we  speculate  that  the  difference  in  hori¬ 
zontal  correlation  length  scale  may  be  a  result  of  (i) 
fundamental  differences  in  the  optimal  scale  of  locali¬ 
zation  functions  for  ensemble  4D-VAR  and  the  se¬ 
quential  EnKF,  (ii)  the  high  density  and  quasi-isotropic 
nature  of  our  idealized  network  favoring  shorter  local¬ 
ization  length  scales  than  the  more  anisotropic  network 
found  in  the  real  atmosphere,  (iii)  our  area-averaged 
measure  of  analysis  error  giving  more  weight  to  the 
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Fig.  8.  As  in  Fig.  7,  but  the  raw  ensemble  covariance  function  has  been  localized  with  the  NECL. 


accuracy  of  analyses  at  tropical  grid  points  than  the 
various  measures  used  by  Mitchell  et  al.  (2002)  to  tune 
their  localization  length  scale,  and/or  (iv)  McLay  et  al.’s 
(2010)  ensemble  generation  technique  causing  ensem¬ 
ble  perturbations  to  decorrelate  over  shorter  distances 
than  those  of  ensemble  perturbations  generated  by  the 
EnKF. 

5.  Conclusions 

A  new  ensemble  covariance  localization  technique 
that  blends  adaptive  localization  with  nonadaptive  lo¬ 
calization  has  been  presented.  The  method  includes 
nonadaptive  ensemble  covariance  localization  and  purely 
adaptive  localization  as  special  cases.  It  was  shown  how 
a  square  root  theorem  enables  the  scheme  to  be  incor¬ 
porated  into  global  4D  variational  algorithms.  To  dem¬ 
onstrate  the  technique,  both  nonadaptive  and  adaptive 
ensemble  covariance  localization  approaches  were  tuned 
to  minimize  globally  averaged  mean  square  analysis  error 


with  both  6-  and  12-h  data  assimilation  windows.  It  was 
found  that  while  the  adaptive  localization  functions  pro¬ 
duced  showed  a  high  degree  of  adaptability  in  both  space 
and  time,  they  were  not  significantly  better  than  non¬ 
adaptive  localization  functions  at  reducing  forecast  or 
analysis  error. 

Bishop  and  Hodyss  (2007,  2009a)  found  clear  benefits 
from  adaptive  localization  when  the  ensemble  pertur¬ 
bations  were  drawn  from  precisely  the  same  distribution 
as  the  distribution  of  forecast  errors.  In  our  experiment, 
the  error-ensemble  mismatch  was  modeled  by  using 
distinctly  different  methods  to  create  a  pseudo-first-guess 
error  and  the  ensemble  perturbations.  It  is  possible  that 
the  mismatch  created  by  our  approach  is  greater  than  that 
between  actual  forecast  error  and  ensemble  perturba¬ 
tions  from  state-of-the-art  ensemble  generation  schemes. 
If  so,  then  the  potential  benefit  of  adaptive  localization  is 
greater  than  that  suggested  by  our  results. 

Other  possible  reasons  for  the  failure  of  adaptive  lo¬ 
calization  to  show  benefit  in  our  experiments  include  the 
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Fig.  9.  Comparison  of  structure  of  optimally  tuned  (a)  adaptive  and  (b)  nonadaptive  localization  functions  at 
1200  UTC  for  4  grid  points  in  the  Northern  Hemisphere  all  situated  at  the  same  latitude  (45°N)  and  cr-level  15  (about 
400  hPa),  but  all  separated  by  90°  of  longitude  (0°,  90°,  180°,  270°E)  and  45°N  at  model  <r-level  15  (about  400  hPa). 
The  corresponding  vertical  structure  of  the  (c)  adaptive  and  (d)  nonadaptive  localization  functions  along  the  45°N 
latitude  circle. 


possibility  that  (i)  the  movement  of  error  correlation 
functions  over  6-  and  12-h  data  assimilation  windows  is 
not  large  enough  for  adaptive  localization  to  show  a  clear 
advantage,  and/or  (ii)  the  spatial  inhomogeneity  of  the 
true  error  correlation  functions  is  not  large  enough  for 
adaptive  localization  to  show  a  benefit.  If  the  primary 
reason  for  the  neutrality  of  our  results  is  short  error 
propagation  distances  and  weak  spatial  inhomogeneity  of 
the  true  error  correlation  length  scale  over  6-1 2-h  DA 
windows,  then  the  neutrality  of  our  results  is  unlikely  to 
change  with  an  improved  ensemble  model  of  the  forecast 
error  distribution. 

The  single  case  studied  here  is  not  enough  to  reveal 
whether  or  not  there  is  a  small,  but  significant  difference 
between  adaptive  and  nonadaptive  localization  in  our 
idealized  system.  However,  because  of  the  somewhat 
contrived  nature  of  our  first-guess  error  and  the 


mismatch  of  our  ensemble  covariances  with  this  first- 
guess  error,  identifying  a  small  but  significant  difference 
from  repeated  experiments  would  be  of  little  relevance 
to  the  design  of  candidate  operational  ensemble  DA 
schemes. 

Future  research  will  be  aimed  at  incorporating  both 
adaptive  and  nonadaptive  ensemble  covariances  into  the 
Navy’s  operational  4D-VAR  scheme  (NAVDAS-AR) 
and  then  performing  cycling  data  assimilation  and  fore¬ 
casting  experiments.  Hopefully,  these  experiments  will 
reveal  significant  (but  probably  small)  differences  in  the 
performance  of  adaptive  and  nonadaptive  localization. 
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APPENDIX 

Creating  Smooth  Normalized  Perturbations 

The  smoothed  ensemble  members  are  obtained  from 
the  following  steps: 

1)  Create  ensemble  perturbations  that  are  normalized 
by  their  ensemble  standard  deviation;  that  is,  if  xqtq 
denotes  the  perturbation  of  the  qth  variable  type  at 
the  rth  output  time  at  the  yth  model  grid  point  of  the 
ith  ensemble  member,  normalize  this  variable  using 

*ijtq  ~  Xijtq/]fl^A^i  =  i (xijtq)  • 

2)  Smooth  over  variable  types  by  performing  a  weighted 
average  over  variable  type  using  xqtq  =  ^^=1wqq  ^qtq, , 
where  wqq  =  l.  Let  the  vector  of  variable  smoothed 
ensemble  perturbations  resulting  from  this  procedure 
be  denoted  by  x-.  Considerable  time  can  be  saved 
by  using  univariate  localization  functions.  Univariate 
adaptive  localization  is  obtained  by  letting  wq=i  =  wq=  ■ 
for  any  variable  type  i  and  j. 

3)  Smooth  the  variables  x-  over  space  to  obtain  x-  by 
first  smoothing  in  the  horizontal  and  then  in  the 
vertical  or  vice  versa. 

If  x\-tq  denotes  the  element  of  x£  corresponding  to  the 
qth  variable  type  at  the  tth  output  time  at  the  yth  model 
grid  point  of  the  ith  ensemble  member,  normalize  this 
variable  using  z\jtq  =  [xsijt?/'Ll1(xsijtqf].  The  vector  zf 
listing  the  variables  obtained  through  this  procedure 
gives  the  ith  column  of  the  smoothed  normalized  en¬ 
semble  matrix  Zs  which  makes  ZsZj  =  be  a 

correlation  matrix. 

In  this  paper,  we  used  a  univariate  localization  based 
solely  on  the  streamfunction;  in  other  words,  for  all 
values  of  we  set  =  0  if  q'  indicated  a  non- 
streamf unction  variable,  but  wqq  =  1  if  q’  indicated  the 
streamfunction  variable.  We  also  used  a  slight  variation 
of  the  above  generalized  approach  in  which  the  part  of 
z-  associated  with  the  terrain  pressure  ps  was  given  by 
the  smoothed  and  normalized  value  of  the  stream- 
function  on  the  model’s  fifth-lowest  vertical  level  (ap¬ 
proximately  900  mb).  We  did  not  use  the  lowest  level 
because  streamfunction  tends  to  zero  at  the  lowest  model 
level. 
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